Coding techniques for coded block parameters of blocks of macroblocks

ABSTRACT

The coded block parameters used to code blocks of image samples into structures called macroblocks are compressed more efficiently by exploiting the correlation between chrominance and luminance blocks in each macroblock. In particular, the coded block pattern for chrominance and luminance are combined into a single parameter for the macroblock and jointly coded with a single variable length code. To further enhance coding efficiency, the spatial coherence of coded block patterns can be exploited by using spatial prediction to compute predicted values for coded block pattern parameters.

TECHNICAL FIELD

[0001] The invention relates to video coding, and specifically, to an improved method for coding block parameters used in frame-based and object-based video coding formats.

BACKGROUND

[0002] Full-motion video displays based upon analog video signals have long been available in the form of television. With recent advances in computer processing capabilities and affordability, full-motion video displays based upon digital video signals are becoming more widely available. Digital video systems can provide significant improvements over conventional analog video systems in creating, modifying, transmitting, storing, and playing full-motion video sequences.

[0003] Digital video displays include large numbers of image frames that are played or rendered successively at frequencies of between 30 and 75 Hz. Each image frame is a still image formed from an array of pixels based on the display resolution of a particular system. As examples, VHS-based systems have display resolutions of 320×480 pixels, NTSC-based systems have display resolutions of 720×486 pixels, and high-definition television (HDTV) systems under development have display resolutions of 1360×1024 pixels.

[0004] The amounts of raw digital information included in video sequences are massive. Storage and transmission of these amounts of video information is infeasible with conventional personal computer equipment. Consider, for example, a digitized form of a relatively low resolution VHS image format having a 320×480 pixel resolution. A full-length motion picture of two hours in duration at this resolution corresponds to 100 gigabytes of digital video information. By comparison, conventional compact optical disks have capacities of about 0.6 gigabytes, magnetic hard disks have capacities of 1-2 gigabytes, and compact optical disks under development have capacities of up to 8 gigabytes.
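
The 100 gigabyte figure can be checked with a short calculation. The sketch below is illustrative only; the 24-bit color depth (3 bytes per pixel) and the 30 Hz frame rate are assumptions, since neither is stated above:

    # Raw size of a two-hour movie at 320x480 resolution, assuming
    # 3 bytes per pixel (24-bit color) and 30 frames per second.
    bytes_per_frame = 320 * 480 * 3               # 460,800 bytes
    total_frames = 30 * 2 * 60 * 60               # 216,000 frames
    total_bytes = bytes_per_frame * total_frames  # 99,532,800,000 bytes
    print(total_bytes / 10**9)                    # ~99.5, i.e. about 100 gigabytes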

[0005] To address the limitations in storing or transmitting such massive amounts of digital video information, various video compression standards or processes have been established, including MPEG-1, MPEG-2, and H.26X. These video compression techniques utilize similarities between successive image frames, referred to as temporal or interframe correlation, to provide interframe compression in which motion data and error signals are used to encode changes between frames.

[0006] In addition, the conventional video compression techniques utilize similarities within image frames, referred to as spatial or intraframe correlation, to provide intraframe compression in which the image samples within an image frame are compressed. Intraframe compression is based upon conventional processes for compressing still images, such as discrete cosine transform (DCT) encoding. This type of coding is sometimes referred to as “texture” or “transform” coding. A “texture” generally refers to a two-dimensional array of image sample values, such as an array of chrominance and luminance values or an array of alpha (opacity) values. The term “transform” in this context refers to how the image samples are transformed into spatial frequency components during the coding process. This use of the term “transform” should be distinguished from a geometric transform used to estimate scene changes in some interframe compression methods.

[0007] Interframe compression typically utilizes motion estimation and compensation to encode scene changes between frames. Motion estimation is a process for estimating the motion of image samples (e.g., pixels) between frames. Using motion estimation, the encoder attempts to match blocks of pixels in one frame with corresponding pixels in another frame. After the most similar block is found in a given search area, the change in position of the pixel locations of the corresponding pixels is approximated and represented as motion data, such as a motion vector. Motion compensation is a process for determining a predicted image and computing the error between the predicted image and the original image. Using motion compensation, the encoder applies the motion data to an image and computes a predicted image. The difference between the predicted image and the input image is called the error signal. Since the error signal is just an array of values representing the difference between image sample values, it can be compressed using the same texture coding method as used for intraframe coding of image samples.

[0008] Although differing in specific implementations, the MPEG-1, MPEG-2, and H.26X video compression standards are similar in a number of respects. The following description of the MPEG-2 video compression standard is generally applicable to the others.

[0009] MPEG-2 provides interframe compression and intraframe compression based upon square blocks or arrays of pixels in video images. A video image is divided into image sample blocks called macroblocks having dimensions of 16×16 pixels. In MPEG-2, a macroblock comprises four luminance blocks (each block is 8×8 samples of luminance (Y)) and two chrominance blocks (one 8×8 sample block each for Cb and Cr).

[0010] In MPEG-2, interframe coding is performed on macroblocks. An MPEG-2 encoder performs motion estimation and compensation to compute motion vectors and block error signals. For each block M_(N) in an image frame N, a search is performed across the image of a next successive video frame N+1 or immediately preceding image frame N−1 (i.e., bi-directionally) to identify the most similar respective blocks M_(N+1) or M_(N−1). The location of the most similar block relative to the block M_(N) is encoded with a motion vector (DX,DY). The motion vector is then used to compute a block of predicted sample values. These predicted sample values are compared with block M_(N) to determine the block error signal. The error signal is compressed using a texture coding method such as discrete cosine transform (DCT) encoding.
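
To make the block-matching search of paragraphs [0007] and [0010] concrete, the following sketch finds a motion vector for a single 16×16 macroblock with an exhaustive sum-of-absolute-differences (SAD) search and then forms the error signal. This is a minimal illustration of the general technique, not the MPEG-2 procedure itself; the SAD criterion and the ±7 pixel search range are assumptions:

    import numpy as np

    def motion_search(cur, ref, bx, by, size=16, search=7):
        """Find the (dx, dy) minimizing SAD between the block at (bx, by)
        in the current frame and candidate blocks in the reference frame,
        then return the motion vector and the block error signal."""
        block = cur[by:by+size, bx:bx+size].astype(np.int32)
        best = (0, 0, float("inf"))
        for dy in range(-search, search + 1):
            for dx in range(-search, search + 1):
                y, x = by + dy, bx + dx
                if y < 0 or x < 0 or y + size > ref.shape[0] or x + size > ref.shape[1]:
                    continue  # candidate block falls outside the reference frame
                cand = ref[y:y+size, x:x+size].astype(np.int32)
                sad = int(np.abs(block - cand).sum())
                if sad < best[2]:
                    best = (dx, dy, sad)
        dx, dy, _ = best
        # Motion compensation: predicted block and error signal.
        pred = ref[by+dy:by+dy+size, bx+dx:bx+dx+size].astype(np.int32)
        return (dx, dy), block - pred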

[0011] Object-based video coding techniques have been proposed as an improvement to the conventional frame-based coding standards. In object-based coding, arbitrary shaped image features are separated from the frames in the video sequence using a method called “segmentation.” The video objects or “segments” are coded independently. Object-based coding can improve the compression rate because it increases the interframe correlation between video objects in successive frames. It is also advantageous for a variety of applications that require access to and tracking of objects in a video sequence.

[0012] In the object-based video coding methods proposed for the MPEG-4 standard, the shape, motion and texture of video objects are coded independently. The shape of an object is represented by a binary or alpha mask that defines the boundary of the arbitrary shaped object in a video frame. The motion of an object is similar to the motion data of MPEG-2, except that it applies to an arbitrary-shaped image of the object that has been segmented from a rectangular frame. Motion estimation and compensation is performed on blocks of a “video object plane” rather than the entire frame. The video object plane is the name for the shaped image of an object in a single frame.

[0013] The texture of a video object is the image sample information in a video object plane that falls within the object's shape. Texture coding of an object's image samples and error signals is performed using similar texture coding methods as in frame-based coding. For example, a segmented image can be fitted into a bounding rectangle formed of macroblocks. The rectangular image formed by the bounding rectangle can be compressed just like a rectangular frame, except that transparent macroblocks need not be coded. Partially transparent blocks are coded after filling in the portions of the block that fall outside the object's shape boundary with sample values in a technique called “padding.”

[0014] Frame-based coding techniques such as MPEG-2 and H.26X and object-based coding techniques proposed for MPEG-4 are similar in that they perform intraframe and interframe coding on macroblocks. Each macroblock includes a series of overhead parameters that provide information about the macroblock. As an example, FIG. 1 shows macroblock parameters used in the header of an interframe macroblock. The COD parameter (10) is a single bit indicating whether the interframe macroblock is coded. In particular, this bit indicates whether or not the encoded macroblock includes motion data and texture coded error data. In cases where the motion and error signal data are zero, the COD bit reduces the information needed to code the macroblock because only a single bit is sent rather than additional bits indicating that the motion vector and texture data are not coded.

[0015] In addition to the COD bit, the coding syntax for macroblocks includes coded block parameters (CBP) indicating whether the coded transform coefficients for chrominance and luminance are transmitted for the macroblock. If the transform coefficients are all zero for a block, then there is no need to send texture data for the block. The Coded Block Parameters for chrominance (CBPC) are two bits indicating whether or not coded texture data is transmitted for each of the two chrominance blocks.

[0016] The CBPC bits are encoded along with another flag that provides information about the type of quantization for the macroblock. These flags are combined to form a parameter called MCBPC (12), and MCBPC is entropy coded using an entropy coding method such as Huffman or arithmetic coding.

[0017] The parameter called the AC_Pred_flag (14) is a flag indicating whether AC prediction is used in the macroblock.

[0018] The Coded Block Pattern for luminance (CBPY) (16) is comprised of four bits indicating whether or not coded texture data is transmitted for each of the four luminance blocks. Like the MCBPC parameter, the CBPY flags are also entropy coded using either Huffman or arithmetic coding.

[0019] After the CBPY parameter, the macroblock includes encoded motion vector data (shown as item 18 in FIG. 1). Following the motion vector data, the “block data” represents the encoded texture data for the macroblock (shown as block data 20 in FIG. 1).

[0020] One drawback of the coding approach illustrated in FIG. 1 is that it codes the CBPC and CBPY flags separately, and therefore, does not exploit the correlation between these parameters to reduce the macroblock overhead. In addition, it does not take advantage of the spatial dependency of the coded block parameters.

SUMMARY

[0021] The invention provides an improved method of coding the macroblock header parameters in video coding applications. One aspect of the invention is a coding method that exploits the correlation between the coded block parameters by jointly coding all of the coded block parameters with a single variable length code. Another aspect of the invention is a coding method that takes advantage of the spatial dependency between the coded block patterns of neighboring blocks.

[0022] In an implementation of the invention, the coded block parameters for luminance and chrominance in a macroblock are formed into a single, combined parameter for the macroblock. The combined parameter is assigned a variable length code from a variable length coding table. The coding table is trained based on a target bit rate (e.g., low bit rate Internet applications) and a target class of video content (e.g., talking head video). By jointly coding the luminance and chrominance values, the encoder exploits the correlation between these parameters in the macroblock.

[0023] To improve the coding efficiency further, the implementation uses prediction to take advantage of the spatial dependency of the coded block parameters of neighboring blocks. Before assigning the variable length code to the combined parameter, some of the coded block parameters are predicted from neighboring blocks. For intraframe macroblocks, for example, the encoder computes a spatially predicted value for each coded block parameter for luminance. This spatially predicted parameter forms part of the combined parameter for the macroblock.

[0024] Additional features and advantages of the invention will become more apparent from the following detailed description and accompanying drawings of an implementation of the invention.

BRIEF DESCRIPTION OF THE DRAWINGS

[0025] FIG. 1 is a diagram illustrating an example of a macroblock header used in a standard video coding process.

[0026] FIG. 2 is a block diagram of a video coder.

[0027] FIG. 3 is a block diagram of a video decoder.

[0028] FIG. 4 is a diagram illustrating an example of an improved macroblock header in which the coded block parameters for chrominance and luminance are jointly coded with a single variable length code.

[0029] FIG. 5 is a flow diagram illustrating how an implementation of the invention computes a single variable length code for the coded block parameters of I and P frame macroblocks.

[0030] FIG. 6 is a diagram illustrating four macroblocks and their corresponding luminance (Y) blocks.

[0031] FIG. 7 is a diagram showing an example of the vertical and horizontal gradients of coded block parameter values for selected luminance blocks in FIG. 6.

[0032] FIG. 8 is a flow diagram illustrating a method for computing a predictor for coded block parameters.

[0033] FIG. 9 is a diagram of a computer system that serves as an operating environment for a software implementation of the invention.

DETAILED DESCRIPTION

[0034] Introduction

[0035] The first section below provides a description of a video encoder and decoder. Subsequent sections describe how to improve the coding of macroblock header parameters by exploiting the correlation between CBPC and CBPY parameters and taking advantage of the spatial dependency of coded block parameters of neighboring blocks.

[0036] Useful in both frame-based and object-based video coding, the invention improves the coding of macroblock parameters, whether the macroblocks are components of arbitrary video objects segmented from a sequence of frames or of rectangular shaped image frames. Object-based coding uses similar motion and texture coding modules as used in frame-based coding. In addition, object-based coders also include shape coding modules. The block syntax relevant to the invention is similar in both frame-based and object-based coding. While the encoder and decoder described in the next section are object-based, they provide a sufficient basis for explaining how to implement the invention in both frame-based and object-based coding schemes.

[0037] Description of an Example Encoder and Decoder

[0038] FIG. 2 is a block diagram illustrating an implementation of an object-based video encoder. The input 30 to the encoder includes a series of objects, their shape information and bounding rectangles. The shape information, therefore, is available before the encoder codes texture or motion data. Frame-based coding differs in that the entire frame is coded without shape information.

[0039] The shape coding module 32 receives the definition of an object including its bounding rectangle and extends the bounding rectangle to integer multiples of macroblocks. The shape information for an object comprises a mask or “alpha plane.”

[0040] The shape coding module 32 reads this mask and compresses it, using, for example, a conventional chain coding method to encode the contour of the object.

[0041] Motion estimation module 34 reads an object including its bounding rectangle and a previously reconstructed image 36 and computes motion estimation data used to predict the motion of the object from one frame to another. After identifying the macroblocks in the current object image, the motion estimation module 34 searches for the most similar macroblock in the reconstructed image for each macroblock in the current object image to compute the motion data for each macroblock. The specific format of the motion data from the motion estimation module 34 can vary depending on the motion estimation method used. The implementation described below computes a motion vector for each macroblock, which is consistent with current MPEG and H.26X formats.

[0042] The motion compensation module 38 reads the motion vectors computed by the motion estimation module and the previously reconstructed image 36 and computes a predicted image for the current frame. The encoder finds the difference between the image sample values in the input image block as specified in the input 30 and the corresponding sample values in the predicted image block as computed in the motion compensation module 38 to determine the error signal for the macroblock.

[0043] Texture coding module 40 compresses this error signal for inter-frame coded objects and compresses image sample values for the object from the input data stream 30 for intra-frame coded objects. The feedback path 42 from the texture coding module 40 represents the decoded error signal. The encoder uses the error signal macroblocks along with the predicted image macroblocks from the motion compensation module to compute the previously reconstructed image 36.

[0044] The texture coding module 40 codes blocks of intra-frame and error signal data for an object using any of a variety of still image compression techniques. Example compression techniques include transform-based techniques such as DCT and wavelet coding as well as other conventional image compression methods such as Laplacian pyramid coding.

[0045] The bitstream of the compressed video sequence includes the shape, motion and texture coded information from the shape coding, motion estimation, and texture coding modules. Multiplexer 44 combines and formats this data into the proper syntax and outputs it to the buffer 46.

[0046] While the encoder can be implemented in hardware or software, it is most likely implemented in software. In a software implementation, the modules in the encoder represent software instructions stored in the memory of a computer and executed in the processor, and the video data is stored in memory. A software encoder can be stored and distributed on a variety of conventional computer-readable media. In hardware implementations, the encoder modules are implemented in digital logic, preferably in an integrated circuit. Some of the encoder functions can be optimized in special-purpose digital logic devices in a computer peripheral to off-load the processing burden from a host computer.

[0047] FIG. 3 is a block diagram illustrating a decoder for an object-based video coding method. A demultiplexer 60 receives a bitstream 62 representing a compressed video sequence and separates the shape, motion and texture encoded data on an object-by-object basis. Shape decoding module 64 decodes the shape or contour for the current object being processed. To accomplish this, it employs a shape decoder that implements the inverse of the shape encoding method used in the encoder of FIG. 2. The resulting shape data is a mask, such as a binary alpha plane or gray scale alpha plane representing the shape of the object.

[0048] The motion decoding module 66 decodes the motion information in the bitstream. The decoded motion information includes motion data such as motion vectors for macroblocks or geometric transform coefficients, depending on the type of estimation method used in the encoder. The motion decoding module 66 provides this motion information to the motion compensation module 68, and the motion compensation module 68 applies the motion data to previously reconstructed object data 70.

[0049] The texture decoding module 74 decodes error signals for inter-frame coded texture data and an array of color values for intra-frame texture data and passes this information to a module 72 for computing and accumulating the reconstructed image. For inter-frame coded objects, this module 72 applies the error signal data to the predicted image output from the motion compensation module to compute the reconstructed object for the current frame. For intra-frame coded objects, the texture decoding module 74 decodes the image sample values for the object and places the reconstructed object in the reconstructed object module 72. Previously reconstructed objects are temporarily stored in object memory 70 and are used to construct the object for other frames.

[0050] Like the encoder, the decoder can be implemented in hardware, software or a combination of both. In software implementations, the modules in the decoder are software instructions stored in the memory of a computer and executed by the processor, and video data is stored in memory. A software decoder can be stored and distributed on a variety of conventional computer-readable media. In hardware implementations, the decoder modules are implemented in digital logic, preferably in an integrated circuit. Some of the decoder functions can be optimized in special-purpose digital logic devices in a computer peripheral to off-load the processing burden from a host computer.

[0051] Improved Coding of Macroblock Overhead

[0052] The invention includes innovations that improve the coding of macroblock header parameters. One innovation is a method for coding the coded block parameters to exploit the correlation between CBPC and CBPY. This innovation is implemented by jointly coding a combined CBPC and CBPY parameter with a single variable length code. Another innovation further improves coding efficiency of the header parameters by exploiting the spatial dependency of the coded block parameters. In particular, coded block parameters are more efficiently compressed by predicting them from the parameters of neighboring blocks.

[0053] FIG. 4 is a diagram illustrating the header block parameters computed by an implementation of the invention. Like the header information shown in FIG. 1, this header block includes a COD parameter 80, an AC_Pred_flag 82, motion vector data (MV 84) and block data 86. Unlike the header in FIG. 1, the MCBPC and CBPY parameters are jointly coded with a single variable length code, called MBCBPCY 88. This code combines the coded block parameters for chrominance and luminance, as well as the flag for macroblock type.

[0054] FIG. 5 is a flow diagram illustrating how the implementation generates a variable length code for Intra (I) frames and Predicted (P) frames. In this particular implementation, the header blocks for I and P frames are coded differently. For I frames, the encoder performs the additional step of predicting the coded block parameters for luminance before selecting the variable length code. It is also possible to use prediction for P frames. However, prediction does not improve coding efficiency significantly in P frames, and in some cases, can even decrease coding efficiency.

[0055] The goal of using prediction for coded block parameters is to produce as many zero values for these parameters as possible. By making the values mostly zero, the encoder reduces the variance of the coded block parameters. The process of training the variable length coding table can then favor the zero value, which improves coding efficiency. In P frames, especially in low bit rate applications, the coded block parameters are mostly zero before prediction. As such, prediction does not tend to increase the number of zero values, and sometimes, it even decreases the number of zero values. Therefore, the implementation shown in FIG. 5 does not use prediction for P frames.

[0056] For P frames, the encoder begins by finding the coded block parameters for luminance and chrominance as shown in step 100. These block parameters are each a single bit indicating whether a corresponding block is texture coded. The coded block parameters are computed in the texture coding module (40 in FIG. 2), which sets a coded block flag for each block that has non-zero encoded texture values. Conversely, the value of the coded block parameter for a block in which the texture values are all zero (or so close to zero as to be negligible) is zero.
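
Put differently, each coded block parameter is a one-bit flag that is set only when a block has at least one nonzero quantized coefficient. A minimal sketch, assuming the quantized transform coefficients for the six blocks of a macroblock are available as 8×8 arrays:

    import numpy as np

    def coded_block_parameters(quantized_blocks):
        """quantized_blocks: six 8x8 arrays of quantized coefficients,
        ordered Y1, Y2, Y3, Y4, U, V. A flag is 1 only if the block has
        a nonzero coefficient, i.e., texture data worth transmitting."""
        return [1 if np.any(b != 0) else 0 for b in quantized_blocks]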

[0057] Since there are two blocks for chrominance (one each for the 8 by 8 pixel U and V blocks) and four blocks for luminance (one each for the four 8 by 8 blocks) in the macroblock, the combined parameter for the coded block pattern is a total of six bits. Combining this 6 bit number with the single bit for macroblock type, the encoder forms a 7 bit number as shown in step 102. The macroblock type indicates whether the macroblock is for an I or P frame.
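
The 7 bit number can be formed by simple bit packing. In the sketch below, the exact bit ordering (macroblock type in the most significant position, then the four Y flags, then U and V) is an assumption, chosen so that the packed value matches the Y(1234) UV ordering and the index column of Table 1 below:

    def pack_mbcbpcy(mb_type_bit, cbpy, cbpc):
        """mb_type_bit: 0 for I, 1 for P (assumed labeling).
        cbpy: four luminance flags [Y1, Y2, Y3, Y4]; cbpc: [U, V]."""
        value = mb_type_bit
        for bit in cbpy + cbpc:      # append the six coded block flags
            value = (value << 1) | bit
        return value                 # 7-bit combined parameter

    # Example: a P-frame macroblock where only Y4 is coded.
    # pack_mbcbpcy(1, [0, 0, 0, 1], [0, 0]) -> 68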

[0058] Once the combined MBCBPCY is formed, the combined parameter is looked up in a variable length coding table to find a corresponding variable length code associated with the parameter as shown in step 104. The encoder assigns a single variable length code to the combined parameter MBCBPCY.
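
The lookup itself is plain table indexing. A minimal sketch, assuming the trained table is loaded as a dictionary keyed by the 7 bit value; the two sample entries are taken from Table 1 below, whose index column coincides with the packed value under the bit ordering assumed above:

    # Fragment of Table 1 as a dictionary: 7-bit MBCBPCY -> code word.
    VLC_TABLE = {
        64: "01",   # P, CBPCY 000000: nothing coded, the cheapest case
        68: "111",  # P, CBPCY 000100: only Y4 coded
    }

    def encode_mbcbpcy(packed):
        """Step 104: assign the single variable length code."""
        return VLC_TABLE[packed]

    # encode_mbcbpcy(pack_mbcbpcy(1, [0, 0, 0, 1], [0, 0])) -> "111"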

[0059] The coding table in this implementation is a Huffman coding table. The table is preferably trained based on the target rate and target scenario. Table 1 below is a Variable Length Coding (VLC) table obtained for a low bit rate “talking head” scenario. For each macroblock in a P frame, the combined MBCBPCY information is coded using the code word for the corresponding entry in this table.

TABLE 1
VLC table for the coded block pattern of chrominance and luminance for P pictures
(CBPCY bits are ordered Y(1234) UV; Bits is the length of the code in bits.)

    Index  Type  CBPCY   Bits  Code
        0  I     000000     7  1000000
        1  I     000001    13  1001111001001
        2  I     000010    12  100111111101
        3  I     000011    15  000000111111100
        4  I     000100    12  100111111100
        5  I     000101    18  000000101010000011
        6  I     000110    17  10010110100110100
        7  I     000111    16  1000001110111100
        8  I     001000    12  100000111010
        9  I     001001    17  00000011111111000
       10  I     001010    16  0000001111111101
       11  I     001011    16  0000001111111111
       12  I     001100    13  0000001111001
       13  I     001101    18  000000101010000010
       14  I     001110    16  1001011010011101
       15  I     001111    16  0000001010100100
       16  I     010000    12  100101111000
       17  I     010001    17  00000010101000011
       18  I     010010    15  100000111011111
       19  I     010011    17  00000011111111001
       20  I     010100    13  1001011110011
       21  I     010101    18  100101101001101011
       22  I     010110    18  100101111011111001
       23  I     010111    16  0000001111111010
       24  I     011000    14  10000011101110
       25  I     011001    20  10010110100110101011
       26  I     011010    16  1001011010011100
       27  I     011011    18  100101111011111000
       28  I     011100    13  1001011010010
       29  I     011101    18  000000101010000101
       30  I     011110    16  1001011010011110
       31  I     011111    15  100101111001000
       32  I     100000    12  000000111101
       33  I     100001    17  10010111101111111
       34  I     100010    16  0000001010100010
       35  I     100011    16  1001011010011111
       36  I     100100    14  10010111101110
       37  I     100101    21  100101101001101010101
       38  I     100110    17  10010111101111101
       39  I     100111    17  10010111101111110
       40  I     101000    12  100111100101
       41  I     101001    18  000000101010000001
       42  I     101010    19  1001011010011010100
       43  I     101011    16  1000001110111101
       44  I     101100    13  0000001111000
       45  I     101101    16  1001011010011011
       46  I     101110    16  0000001111111110
       47  I     101111    16  0000001010100101
       48  I     110000    13  0000001111110
       49  I     110001    18  000000101010000000
       50  I     110010    16  0000001010100011
       51  I     110011    16  0000001111111011
       52  I     110100    13  1000001110110
       53  I     110101    18  000000101010000100
       54  I     110110    15  000000101010011
       55  I     110111    15  100101111001001
       56  I     111000    13  0000001010101
       57  I     111001    21  100101101001101010100
       58  I     111010    15  100101111011110
       59  I     111011    14  10010111100101
       60  I     111100    10  1001011011
       61  I     111101    15  100101101001100
       62  I     111110    12  100101101011
       63  I     111111    12  100101101010
       64  P     000000     2  01
       65  P     000001     7  0000000
       66  P     000010     6  100110
       67  P     000011     9  100101011
       68  P     000100     3  111
       69  P     000101    10  1000001111
       70  P     000110     9  000000100
       71  P     000111    12  000000101000
       72  P     001000     3  110
       73  P     001001    10  1000001010
       74  P     001010     9  100101000
       75  P     001011    12  000000101011
       76  P     001100     5  10001
       77  P     001101    11  00000011011
       78  P     001110     9  100111010
       79  P     001111    11  10011111111
       80  P     010000     4  0011
       81  P     010001    10  1001110111
       82  P     010010     9  100000110
       83  P     010011    12  100000111001
       84  P     010100     4  1011
       85  P     010101    10  1001111011
       86  P     010110     9  100101100
       87  P     010111    11  10010111111
       88  P     011000     6  001001
       89  P     011001    12  000000110101
       90  P     011010    10  1001111110
       91  P     011011    13  1001111001000
       92  P     011100     6  000001
       93  P     011101    11  10010101010
       94  P     011110    10  1000001000
       95  P     011111    12  000000101001
       96  P     100000     4  0001
       97  P     100001    10  1001010100
       98  P     100010     9  100101110
       99  P     100011    12  100000111000
      100  P     100100     6  100100
      101  P     100101    11  10011110011
      102  P     100110    10  1001110110
      103  P     100111    13  1001011110110
      104  P     101000     5  00001
      105  P     101001    10  1001111010
      106  P     101010     9  100111110
      107  P     101011    12  000000111110
      108  P     101100     6  001000
      109  P     101101    11  10000010011
      110  P     101110    10  0000001100
      111  P     101111    11  10010111110
      112  P     110000     5  10100
      113  P     110001    11  10000010010
      114  P     110010    10  1001010011
      115  P     110011    12  100101111010
      116  P     110100     6  100001
      117  P     110101    11  10010101011
      118  P     110110    10  1000001011
      119  P     110111    12  000000110100
      120  P     111000     5  10101
      121  P     111001    10  1001111000
      122  P     111010    10  1001010010
      123  P     111011    12  100101101000
      124  P     111100     5  00101
      125  P     111101    10  0000001011
      126  P     111110     8  10011100
      127  P     111111    10  0000001110

[0060] In the implementation shown in FIG. 5, I frames are coded differently than P frames in that the encoder uses prediction to exploit the spatial dependency of the coded block parameters. For each macroblock, the encoder begins by getting the coded block parameters for chrominance and luminance as shown in step 106.

[0061] Next, the encoder computes the predictor of the coded block parameters for luminance. In this particular implementation, the encoder only uses prediction for the CBPY parameters. However, the same prediction method could also be used to predict the coded block parameters for chrominance. In the case of chrominance, the prediction is computed based on 8 by 8 pixel chrominance blocks in neighboring macroblocks rather than the neighboring 8 by 8 pixel luminance blocks, which may be in the same macroblock or a neighboring macroblock. Since each macroblock has four luminance blocks, the neighboring blocks for a given luminance block may come from the same or a neighboring macroblock. For prediction involving chrominance blocks, the neighboring blocks come from neighboring macroblocks.

[0062] The encoder performs spatial prediction on coded block parameters. First, it looks at the coded block parameters for neighboring blocks to determine whether the value of the block parameter is likely to change from a neighboring block to the current block of interest. If the location of a block representing the smallest change in the coded block parameter can be identified (i.e., the lowest spatial gradient in the coded block parameters), then the coded block parameter for the block at this location is used as the predictor. Otherwise, it does not matter which neighbor is chosen as the predictor and one is merely selected. A specific example of selecting the predictor is described and illustrated in more detail with reference to FIGS. 6-8 below.

[0063] In the next step 110, the encoder computes a predicted value for the coded block parameters. The predicted value represents the change in the coded block parameter between the predictor block and the current block. To compute the predicted value, the encoder performs a bitwise exclusive OR (XOR) on the predictor value and the current block value. The resulting vector, called CBPCY_XOR, is then assigned a variable length code from a Huffman table. The encoder looks up the entry for CBPCY_XOR in the table and finds the corresponding variable length code. Table 2 below shows the VLC table used to code predicted CBPCY values for I frames in the implementation.

TABLE 2
VLC table for the coded block pattern of chrominance and luminance for I pictures
(CBPCY_XOR bits are ordered Y(1234) UV; Bits is the length of the code in bits.)

    Index  CBPCY_XOR  Bits  Code
        0  000000        1  1
        1  000001        6  010111
        2  000010        5  01001
        3  000011        5  00101
        4  000100        5  00110
        5  000101        9  001000111
        6  000110        7  0100000
        7  000111        7  0010000
        8  001000        5  00010
        9  001001        9  001111100
       10  001010        7  0111010
       11  001011        7  0011101
       12  001100        6  000010
       13  001101        9  011101100
       14  001110        8  01110111
       15  001111        8  00000000
       16  010000        5  00011
       17  010001        9  010110111
       18  010010        7  0101100
       19  010011        7  0010011
       20  010100        6  000001
       21  010101       10  0101101000
       22  010110        8  01000110
       23  010111        8  00111111
       24  011000        6  011110
       25  011001       13  0011100010010
       26  011010        9  010110101
       27  011011        8  01000010
       28  011100        7  0100010
       29  011101       11  00111000101
       30  011110       10  0100011110
       31  011111        9  010000111
       32  100000        4  0110
       33  100001        9  000000011
       34  100010        7  0011110
       35  100011        6  011100
       36  100100        7  0010010
       37  100101       12  001110001000
       38  100110        9  001000100
       39  100111        9  001110000
       40  101000        6  011111
       41  101001       11  01000111110
       42  101010        8  00111001
       43  101011        9  010001110
       44  101100        7  0000001
       45  101101       11  00111000110
       46  101110        9  010110110
       47  101111        9  001000101
       48  110000        6  010100
       49  110001       11  01000111111
       50  110010        9  001111101
       51  110011        9  000011000
       52  110100        7  0000111
       53  110101       11  00111000111
       54  110110        9  010000110
       55  110111        9  000011001
       56  111000        6  010101
       57  111001       10  0111011011
       58  111010        9  000000010
       59  111011        9  001000110
       60  111100        8  00001101
       61  111101       13  0011100010011
       62  111110       10  0111011010
       63  111111       10  0101101001
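
In code, step 110 reduces to a bitwise XOR of each actual flag with its predictor bit. A minimal sketch, assuming the predictor bits have already been selected as described in connection with FIGS. 6-8 below:

    def predict_cbpy(cbpy, predictors):
        """Step 110: XOR each actual luminance flag with its predictor.
        A result bit of 0 means 'same as the predictor'; when all six
        CBPCY_XOR bits are 0 (index 0 of Table 2), the trained I-frame
        table emits the 1-bit code '1'."""
        return [actual ^ pred for actual, pred in zip(cbpy, predictors)]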

[0064] FIGS. 6-8 illustrate the spatial prediction performed in the encoder in more detail. FIG. 6 is a diagram showing four neighboring macroblocks (top left 120, top right 122, lower left 124, and lower right 126). The following example focuses on the lower right macroblock, which is circled. Each of the macroblocks includes four 8 by 8 pixel blocks for luminance, labeled as Y1, Y2, Y3 and Y4.

[0065] As an example, consider the top left luminance block Y1 for macroblock 126. The blocks used to compute the predictor are surrounded by a dashed line 128. The block of interest is Y1 (labeled as block 130a), and the blocks used to compute the predictor are the neighboring blocks labeled as 132a, 134a, and 136a.

[0066] To give a specific example, FIG. 7 shows values of the coded block pattern parameters for each of the blocks within the dashed line of FIG. 6. The reference numbers 130b, 132b, 134b and 136b correspond to the blocks 130a, 132a, 134a and 136a of FIG. 6, respectively. The spatial gradients of the neighboring coded block parameters are used to select the predictor. In particular, the vertical gradient is computed from the coded block parameters of the top-left and left neighboring blocks (136a, 132a, shown circled 140 in FIG. 7). The horizontal gradient is computed from the coded block parameters of the top-left and top neighboring blocks (136a, 134a, shown circled 142 in FIG. 7).

[0067] FIG. 8 is a flow diagram illustrating the steps for finding the predictor. First, the encoder finds the vertical and horizontal gradients. Each is computed as the exclusive OR of the coded block parameters shown circled in FIG. 7 (140 is the vertical gradient and 142 is the horizontal gradient). Next, the encoder compares the gradient values. If the gradients are not the same, the encoder selects the predictor as the value assigned to the block in the direction of the lower gradient. In the example shown in FIG. 7, the vertical gradient is zero, while the horizontal gradient is one. Thus, the direction of the lower gradient is up. As such, the value of the coded block parameter for block 134a is used as the predictor because it is located in the “up” direction relative to the block of interest.
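
The decision in FIG. 8 can be written compactly. In the sketch below, left, top, and topleft stand for the coded block parameters of blocks 132a, 134a, and 136a; because the parameters are single bits, XOR serves as the gradient:

    def select_predictor(left, top, topleft):
        """Return the predictor bit for the current block (FIG. 8).
        Vertical gradient: top-left vs. left (140 in FIG. 7).
        Horizontal gradient: top-left vs. top (142 in FIG. 7)."""
        vertical = topleft ^ left
        horizontal = topleft ^ top
        if vertical < horizontal:
            return top   # values change least vertically: predict from above
        if horizontal < vertical:
            return left  # values change least horizontally: predict from the left
        return left      # equal gradients: either neighbor works, pick one

    # FIG. 7 example: vertical = 0, horizontal = 1, so the predictor is
    # the top neighbor 134a, matching the text above.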

[0068] Whether or not prediction is used to modify the coded block parameters, the end result is a single variable length code representing all of the coded block parameters for the macroblock. Since I and P frames are coded differently in the implementation, the decoder treats the macroblocks for these frames differently. For P frames, the decoder uses VLC Table 1 to look up the single variable length code and find the corresponding entry that stores the combined parameter representing the coded block parameters for luminance and chrominance. For I frames, the decoder uses VLC Table 2 to look up the single variable length code and find the corresponding entry that stores the combined parameter representing the coded block parameters for luminance and chrominance. In both I and P frames, the texture decoding module (block 74 in FIG. 3) uses the coded block parameters to determine whether the texture data for the corresponding block needs to be decoded. The decoder skips texture decoding for blocks having a coded block parameter of zero.

[0069] In cases where the coded block parameters are also predicted, the decoder uses the previously decoded block parameters from the neighboring blocks to compute the coded block parameter for the current block of interest. First, the decoder computes the location of the predictor block based on the spatial gradients in the same manner as in the encoder. Next, it computes the value of the coded block parameter for the current block by computing the exclusive OR of the decoded value and the coded block parameter of the predictor block (the exclusive OR operator has the following property: if X XOR Y = Z, then Z XOR X = Y). After this inverse prediction stage, the texture decoder then uses the coded block parameter to determine whether to skip decoding the texture for the block.
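
Because XOR is its own inverse, the decoder recovers the actual flag by repeating the prediction and XORing again. A minimal sketch of this inverse prediction stage, reusing the select_predictor function from the encoder-side sketch above:

    def decode_cbp_bit(xor_bit, left, top, topleft):
        """The encoder sent xor_bit = actual XOR pred, so
        actual = xor_bit XOR pred, where pred is derived from the same
        previously decoded neighboring coded block parameters."""
        pred = select_predictor(left, top, topleft)
        return xor_bit ^ pred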

[0070] Brief Overview of a Computer System

[0071] FIG. 9 and the following discussion are intended to provide a brief, general description of a suitable computing environment in which the invention may be implemented. Although the invention or aspects of it may be implemented in a hardware device, the encoder and decoder described above are implemented in computer-executable instructions organized in program modules. The program modules include the routines, programs, objects, components, and data structures that perform the tasks and implement the data types described above.

[0072] While FIG. 9 shows a typical configuration of a desktop computer, the invention may be implemented in other computer system configurations, including hand-held devices, multiprocessor systems, microprocessor-based or programmable consumer electronics, minicomputers, mainframe computers, and the like. The invention may also be used in distributed computing environments where tasks are performed by remote processing devices that are linked through a communications network. In a distributed computing environment, program modules may be located in both local and remote memory storage devices.

[0073] FIG. 9 illustrates an example of a computer system that serves as an operating environment for the invention. The computer system includes a personal computer 920, including a processing unit 921, a system memory 922, and a system bus 923 that interconnects various system components including the system memory to the processing unit 921. The system bus may comprise any of several types of bus structures including a memory bus or memory controller, a peripheral bus, and a local bus using a bus architecture such as PCI, VESA, Microchannel (MCA), ISA and EISA, to name a few. The system memory includes read only memory (ROM) 924 and random access memory (RAM) 925. A basic input/output system 926 (BIOS), containing the basic routines that help to transfer information between elements within the personal computer 920, such as during start-up, is stored in ROM 924. The personal computer 920 further includes a hard disk drive 927, a magnetic disk drive 928, e.g., to read from or write to a removable disk 929, and an optical disk drive 930, e.g., for reading a CD-ROM disk 931 or to read from or write to other optical media. The hard disk drive 927, magnetic disk drive 928, and optical disk drive 930 are connected to the system bus 923 by a hard disk drive interface 932, a magnetic disk drive interface 933, and an optical drive interface 934, respectively. The drives and their associated computer-readable media provide nonvolatile storage of data, data structures, computer-executable instructions (program code such as dynamic link libraries, and executable files), etc. for the personal computer 920. Although the description of computer-readable media above refers to a hard disk, a removable magnetic disk and a CD, it can also include other types of media that are readable by a computer, such as magnetic cassettes, flash memory cards, digital video disks, Bernoulli cartridges, and the like.

[0074] A number of program modules may be stored in the drives and RAM 925, including an operating system 935, one or more application programs 936, other program modules 937, and program data 938. A user may enter commands and information into the personal computer 920 through a keyboard 940 and pointing device, such as a mouse 942. Other input devices (not shown) may include a microphone, joystick, game pad, satellite dish, scanner, or the like. These and other input devices are often connected to the processing unit 921 through a serial port interface 946 that is coupled to the system bus, but may be connected by other interfaces, such as a parallel port, game port or a universal serial bus (USB). A monitor 947 or other type of display device is also connected to the system bus 923 via an interface, such as a display controller or video adapter 948. In addition to the monitor, personal computers typically include other peripheral output devices (not shown), such as speakers and printers.

[0075] The personal computer 920 may operate in a networked environment using logical connections to one or more remote computers, such as a remote computer 949. The remote computer 949 may be a server, a router, a peer device or other common network node, and typically includes many or all of the elements described relative to the personal computer 920, although only a memory storage device 950 has been illustrated in FIG. 9. The logical connections depicted in FIG. 9 include a local area network (LAN) 951 and a wide area network (WAN) 952. Such networking environments are commonplace in offices, enterprise-wide computer networks, intranets and the Internet.

[0076] When used in a LAN networking environment, the personal computer 920 is connected to the local network 951 through a network interface or adapter 953. When used in a WAN networking environment, the personal computer 920 typically includes a modem 954 or other means for establishing communications over the wide area network 952, such as the Internet. The modem 954, which may be internal or external, is connected to the system bus 923 via the serial port interface 946. In a networked environment, program modules depicted relative to the personal computer 920, or portions thereof, may be stored in the remote memory storage device. The network connections shown are merely examples and other means of establishing a communications link between the computers may be used.

[0077] Conclusion

[0078] While the invention has been illustrated using a specific implementation as an example, the scope of the invention is not limited to the specific implementation. For example, it is possible to use spatial prediction for both chrominance and luminance blocks using similar techniques. In addition, spatial prediction may be used for coding the coded block parameters for both intra and predicted frames. The implementation uses Huffman tables to generate variable length codes. In fact, a variety of entropy coding methods may be used to generate a variable length code for each combined coded block parameter. For instance, various forms of arithmetic and/or run length encoding may be used. Each of these coding methods assigns longer codes to input signals that occur less frequently while assigning shorter codes to more frequent input signals. As noted above, the coding methods for improving the efficiency of macroblock headers can be applied to frame-based and object-based coding methods.

[0079] In view of the many possible implementations of the invention, it should be recognized that the implementation described above is only an example of the invention and should not be taken as a limitation on the scope of the invention. Rather, the scope of the invention is defined by the following claims. We therefore claim as our invention all that comes within the scope and spirit of these claims.

We claim:
1. In a video coder for coding video images in a block format, a method for improving compression of the video images comprising: for a macroblock in a video frame, determining whether texture values for the color values of the macroblock are coded and setting the coded block parameters corresponding to the colors to indicate whether or not the texture values are coded; forming a combined parameter representing all of the coded block parameters for the macroblock; determining a single variable length code for the combined parameter of the macroblock; and repeating the above steps for macroblocks in the video image.

2. The method of claim 1 wherein: the texture values are chrominance values U and V, and luminance values Y; the macroblock includes one block for U, one block for V and four blocks for Y; and the coded block parameters include one bit each for U and V indicating whether the corresponding U and V blocks are coded, and four bits for Y indicating whether the four corresponding Y blocks are coded.

3. The method of claim 1 wherein the forming step includes forming a combined parameter representing all of the coded block parameters and a parameter representing macroblock type for the macroblock.

4. The method of claim 1 further including: selecting a predictor for the coded block parameters; and computing an exclusive OR between the predictor and the coded block parameters to compute predicted coded block parameters, the predicted coded block parameters forming at least a part of the combined coded block parameter for the macroblock; wherein the step of determining the single variable length code includes looking up the combined coded block parameter in a variable length coding table to find the single variable length code for the combined coded block parameter.

5. The method of claim 4 wherein the texture values are chrominance values U and V, and luminance values Y; the macroblock includes one block for U, one block for V and four blocks for Y; the coded block parameters include one bit each for U and V indicating whether the corresponding U and V blocks are coded, and four bits for Y indicating whether the four corresponding Y blocks are coded; and the step of selecting a predictor includes computing a predictor block for each of the four Y blocks.

6. The method of claim 4 wherein the texture values are chrominance values U and V, and luminance values Y; the macroblock includes one block for U, one block for V and four blocks for Y; the coded block parameters include one bit each for U and V indicating whether the corresponding U and V blocks are coded, and four bits for Y indicating whether the four corresponding Y blocks are coded; and the step of selecting a predictor includes computing a predictor block for the U and V blocks.

7. The method of claim 4 wherein the step of selecting a predictor includes: computing a horizontal gradient of coded block parameters for neighboring blocks positioned adjacent each other in a horizontal direction; computing a vertical gradient of coded block parameters for neighboring blocks positioned adjacent each other in a vertical direction; determining whether the gradient is smaller in the vertical or the horizontal direction; and selecting the neighboring block in the direction of the smaller gradient as the predictor for the block.

8. The method of claim 1 further including: selecting a predictor for at least a first coded block parameter; and computing a predicted value representing a change in value between the predictor and the first coded block parameter, wherein the combined parameter includes the predicted value.

9. The method of claim 1 wherein the video image comprises two or more video object planes, each being divided into macroblocks, and the steps of claim 1 are repeated for the macroblocks of each of the video object planes.

10. A computer readable medium on which is stored instructions for performing the steps of claim 1.

11. In a video decoder, a method for decoding a macroblock comprising: receiving a variable length code representing a combined coded block parameter for the macroblock representing all coded block parameters for the macroblock; looking up the variable length code in a variable length coding table to find a corresponding entry for the variable length code representing the combined coded block parameter; and using flags encoded in the combined coded block parameter to determine whether texture is coded for blocks corresponding to each flag.

12. The method of claim 11 wherein a first variable length coding table is used for macroblocks in intra frames in an image sequence, and a second variable length coding table is used for macroblocks in predicted image frames.

13. The method of claim 12 wherein the first variable length coding table stores entries for variable length codes, each representing a combined macroblock parameter that includes coded block patterns for chrominance and luminance; and wherein the second variable length coding table stores entries for variable length codes, each representing a combined macroblock parameter that includes coded block patterns for chrominance and luminance.

14. The method of claim 13 wherein the combined macroblock parameters in the first table also include a parameter representing macroblock type.

15. The method of claim 11 wherein at least one of the coded block parameters in the combined coded block parameters is a spatially predicted coded block parameter; and further including: after looking up the variable length code in the variable length coding table, computing a predictor block among neighboring blocks of a block corresponding to the spatially predicted coded block parameter; and computing a coded block parameter value for the block from the spatially predicted coded block parameter and a coded block parameter for the predictor block.

16. The method of claim 15 wherein the step of computing the predictor block includes: computing spatial gradients of coded block parameters between pairs of neighboring blocks; and selecting a block in a direction of a lowest spatial gradient as the predictor block.

17. The method of claim 16 wherein computing the coded block parameter for the block includes: computing the exclusive OR of the spatially predicted coded block parameter and the coded block parameter for the predictor block.

18. The method of claim 11 wherein the combined coded block parameter represents coded block parameters for each luminance block and each chrominance block in the macroblock.

19. A computer readable medium on which is stored instructions for performing the method of claim 11.

20. A computer readable medium on which is stored an encoded video frame sequence comprising: intra-frame macroblocks, each intra-frame coded macroblock including a variable length code representing a combined parameter including a coded block parameter for each luminance block and each chrominance block in the macroblock; predicted frame macroblocks, each predicted frame coded macroblock including a variable length code representing a combined parameter including a coded block parameter for each luminance block and each chrominance block in the macroblock; wherein at least one of the coded block parameters is spatially predicted from a neighboring block before being formed into the combined coded block parameter for a corresponding macroblock.